Abstract

Mechanical unfolding of several domains of calmodulin and titin is studied using a
Go-like model with a realistic contact map and Lennard-Jones contact interactions.
It is shown that this simple model captures the experimentally observed difference
between the two proteins: titin is a spring that is tough and strong whereas
calmodulin acts like a weak spring with featureless force-displacement curves. The
difference is related to the dominance of the α secondary structures in the native
structure of calmodulin. The tandem arrangements of calmodulin unwind simultaneously
in each domain whereas the domains in titin unravel in a serial fashion. The
sequences of contact events during unraveling are correlated with the contact order,
i.e. with the separation between contact making amino acids along the backbone in
the native state. Temperature is found to affect stretching in a profound way.
1 Introduction



Recent developments in technology have enabled studies of large biomolecules
through mechanical manipulation. The most common techniques of such
manipulation involve atomic force microscopy and optical tweezers. These
techniques target the hydrophobic and hydrogen bond interactions that are
an order of magnitude weaker than those corresponding to the covalent bonds.
The simplest protocol used in the studies is to anchor one segment of a
molecule to a substrate and pull by another segment at a constant speed,
v_p. By monitoring the force of resistance to the pull, F, and plotting it versus
the tip displacement, d, one obtains an elastic characterization of a molecule
that requires theoretical interpretation in terms of the sequence of the
rupturing events. The resulting F-d plots have a fine structure consisting of
peaks, minima, and plateaus that depend on the molecule and conditions of
its environment. The first systems that were studied through stretching were
the streptavidin-biotin complex [1,2], DNA [3,4,5], and the multi-domained
titin [6,7,8] that is found in a class of sarcomeres in muscles.

Pulling apart two strands of the DNA involves breaking one hydrogen bond
at a time and this makes the force undulate around the value of 13 pN
with an amplitude not exceeding 1 pN [5]. The typical pulling speed in these
experiments is 40 nm/s and a five-fold increase in v_p affects the F-d pattern
only very weakly. Separating biotin from streptavidin involves stretching many
bonds simultaneously and, within the pulling distance of about 10 Å, the peak
force is close to 300 pN [2]. Stretching of titin results in a sawtooth-like pattern
[6,9,10,11] where each tooth is attributed to unwinding of a single domain and
has a peak value of order 200 pN. Titin has been found to act like a
non-Hookean spring which is strong, tough and nearly reversible [9]. Its properties
have inspired biomimetic design of polymers [12]. The calcium binding C2A
protein, on the other hand, has been found to be much weaker: the peak force
is only of order 60 pN and its sawtooth-like pattern has more structure within
each tooth [13]. For poly-calmodulin, another calcium binding protein, there
are no significant force peaks since no cluster of hydrogen bonds undergoes
breaking [13] until the molecule is fully stretched. The patterns obtained for
modular proteins generally depend on whether the modules are connected
end-to-end or away from the terminals, as in the case of ubiquitin [14] or
lysozyme [15,16].

The F-d patterns appear to act like fingerprints of biomolecules but they
also depend on the temperature, T, as evidenced experimentally [17] and
theoretically [18,16]. In this paper, we demonstrate that simple geometry based
theoretical models can capture substantial differences in the F-d curves that
exist between proteins and show that these differences diminish on increasing
the temperature and disappear in the entropic limit. Our presentation is
focused on two proteins: the I27 domain of titin and calmodulin, corresponding
to the Protein Data Bank [19] codes 1tit and 1cfc respectively. The former
has the architecture of the β sandwich [20] with no α helices and with the β
content of 32.65 % whereas calmodulin is mostly an α protein, as shown in
figure 1: the helical content is 54.05 % and the β content is 8.11 %.

There are obvious advantages to all-atom modelling compared to simplified
coarse grained models: it offers a more realistic description and its nature is
more fundamental. All-atom modelling of 1tit [21] leads to the identification of
the hydrogen bonds linking the so called A' and G strands as being responsible
for the maximum force in this case. There are also equally clear drawbacks
that are related to the necessity of dealing only with very short time scales,
typically of order nanoseconds. This results in considering pulling speeds
which are 6-7 orders of magnitude too rapid and which may be responsible
for the peak forces calculated for titin [21] being an order of magnitude too
large (another reason for the discrepancy with the experiment could be the
surface tension effects due to the droplet of water that surrounds the protein).
The models presented here allow for studies that a) involve more realistic v_p,
b) incorporate tandem connection of several domains, c) enable comparison to
the kinetics of folding, d) deal with variations of parameters, such as T, and
e) easily compare various proteins.



2 Model 

Protein folding is thought to be governed by the geometry of the protein
[22,23,24,25] and especially by the geometry of its native state [26,27,28]. One
way to incorporate geometry into the model is to follow the prescription of
Go [29,30]: construct a Hamiltonian that incorporates the chain-like
connectivity and which has a ground state that agrees with the experimentally
determined native conformation. Our realisation of this prescription within a
coarse grained model that is studied through the techniques of molecular
dynamics is outlined in references [31,32,33].

Briefly, the amino acids are represented by point particles of mass m located
at the positions of the Cα atoms. They are tethered by a strong harmonic
potential with a minimum at 3.8 Å. The interactions between the amino acids
are grouped into contacts of the native and non-native kinds. The distinction
is based on taking the atomic representation of the amino acids in the native
state and then checking for their possible overlaps. The occurrence of an overlap
is determined assuming that the atoms occupy spheres corresponding
to the van der Waals radii of the atoms, enlarged by the factor of 1.24 [34,35]
to account for the soft part of the interaction potential. The amino acids (i
and j) that are found to overlap in this sense are considered to be forming
contacts. These pairs are endowed with the Lennard-Jones potential

V_{ij} = 4\epsilon \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right]

such that its minimum agrees with the experimental value of the distance
between the Cα atoms in the native state. This condition selects a pair by
pair value of the length parameter σ_ij whereas the energy parameter ε is kept
uniform. ε could be made pair-specific if the appropriate values were better
understood. It corresponds to many effective non-covalent interactions, such as
hydrophobicity and hydrogen bonds, so ε/k_B should range between 800 and 2300
K. It appears [18] that the reduced temperature k_B T/ε of about 0.3, where k_B
is the Boltzmann constant, qualitatively reproduces the room temperature elastic
behavior of titin. The temperatures quoted below are in these reduced units.
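The contact criterion and the choice of the length parameter described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the paper: the function names, the (position, radius) representation of an atom, and the default ε = 1 (reduced units) are hypothetical. The sketch assumes the standard 12-6 Lennard-Jones form, whose minimum sits at 2^(1/6) σ, so σ_ij is obtained by dividing the native Cα-Cα distance by 2^(1/6).

```python
import math

def atoms_overlap(pos_a, rad_a, pos_b, rad_b, scale=1.24):
    """True if two atoms, drawn as spheres with van der Waals radii
    enlarged by `scale`, overlap."""
    return math.dist(pos_a, pos_b) < scale * (rad_a + rad_b)

def residues_in_contact(atoms_i, atoms_j, scale=1.24):
    """True if any atom of residue i overlaps any atom of residue j.
    Each residue is a list of (position, vdw_radius) pairs (assumed layout)."""
    return any(
        atoms_overlap(pa, ra, pb, rb, scale)
        for pa, ra in atoms_i
        for pb, rb in atoms_j
    )

def lj_sigma_from_native(r_native):
    """Length parameter sigma_ij that puts the LJ minimum at r_native."""
    return r_native / 2.0 ** (1.0 / 6.0)

def lj_potential(r, sigma, eps=1.0):
    """12-6 Lennard-Jones energy for a pair at distance r."""
    x = (sigma / r) ** 6
    return 4.0 * eps * (x * x - x)
```

With this choice the contact energy at the native distance equals -ε for every pair, which is what keeping ε uniform amounts to.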



The properties of the native contacts in the two proteins studied here are
illustrated in Figure 2. There are 209 contacts (89 amino acids) in 1tit and
426 contacts (148 amino acids) in 1cfc. For 1cfc, 71 % of the contacts are
local: their sequence distance does not exceed 4. This reflects the high
α-content. In contrast, only 30 % of the contacts in 1tit are local. The contact
map for 1tit, also shown in Figure 2, is organised in stripes corresponding to
interactions between distinct β-strands. Such patterns are not present in the
case of 1cfc, suggesting a qualitatively different network of the couplings.
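The contact statistics quoted above are simple to compute once a list of native contacts is available. The sketch below is illustrative only: the function names and the representation of a contact as a pair of residue indices (i, j) are assumptions. The binning follows the description in the caption of Figure 2 (a first bin up to 4, a second bin from 5 to 10, then bins of width 10).

```python
from collections import Counter

def local_fraction(contacts, max_sep=4):
    """Fraction of contacts (i, j) whose sequence distance |i - j|
    does not exceed max_sep."""
    if not contacts:
        raise ValueError("empty contact list")
    local = sum(1 for i, j in contacts if abs(i - j) <= max_sep)
    return local / len(contacts)

def sequence_distance_bin(s):
    """Bin index: 0 for s <= 4, 1 for 4 < s <= 10, then 2 for 11-20,
    3 for 21-30, and so on."""
    if s <= 4:
        return 0
    if s <= 10:
        return 1
    return (s - 1) // 10 + 1

def sequence_distance_histogram(contacts):
    """Count contacts per sequence-distance bin."""
    return Counter(sequence_distance_bin(abs(i - j)) for i, j in contacts)
```

Applied to a contact list built for each protein, `local_fraction` would give the share of local contacts that the text uses to distinguish the two structures.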

The thermal fluctuations away from the native state are mimicked in the
molecular dynamics simulation by introducing the Langevin noise with the
damping constant γ of 2m/τ, where τ is √(mσ²/ε). This corresponds to the
situation in which the inertial effects are negligible [33] but a more realistic
account of the water environment requires γ to be about 25 times larger [36].
Thus the time scales obtained for γ = 2m/τ need to be multiplied by 25 since
a linear dependence on γ has been found [31,32].
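A minimal sketch of a Langevin update of this kind is given below. It is not the integrator used in the cited work; it is a plain Euler-type step written in reduced units (m = 1, τ = 1, ε = 1, k_B = 1), and all names and default values are illustrative assumptions. The random force has variance 2γk_BT/Δt per coordinate, which is the usual fluctuation-dissipation choice for a discretised white noise.

```python
import math
import random

def langevin_step(x, v, force, dt, gamma=2.0, kT=0.3, mass=1.0, rng=random):
    """Advance coordinates x and velocities v (lists of floats) by one step of
    m dv/dt = F(x) - gamma * v + noise, using a simple Euler-type update.
    `force` maps the coordinate list to a list of forces."""
    noise_sd = math.sqrt(2.0 * gamma * kT / dt)
    f = force(x)
    x_new, v_new = [], []
    for xi, vi, fi in zip(x, v, f):
        accel = (fi - gamma * vi + rng.gauss(0.0, noise_sd)) / mass
        vn = vi + accel * dt          # velocity update
        v_new.append(vn)
        x_new.append(xi + vn * dt)    # position update with the new velocity
    return x_new, v_new

def rescale_time(t, factor=25.0):
    """Map a time obtained with gamma = 2 m/tau onto the more realistic
    damping, assuming the linear dependence on gamma stated in the text."""
    return factor * t
```

At kT = 0 the noise term vanishes and the step reduces to damped deterministic motion, which corresponds to the T = 0 runs discussed later in the text.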

Stretching is implemented by attaching both ends of the protein to harmonic
springs of spring constant k = 0.12 ε/Å², i.e. of order 0.4 N/m, which is typical
for atomic force microscopy. The outer end of one spring is held fixed
whereas the outer end of the other is pulled along the initial end-to-end vector.
Our results are shown for the pulling speed of 0.005 Å/τ which corresponds to
7 × 10^ nm/s. Even though this speed is 3 orders of magnitude faster than in
experiments, our previous studies [37,18,38] indicated only small logarithmic
corrections on going to still smaller values of v_p.
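The pulling protocol can be illustrated in one dimension along the pulling direction. The sketch below is an assumption-laden simplification: the function names are hypothetical, positions are projected onto the initial end-to-end vector, and the measured force is taken to be the restoring force of the moving spring. Units are reduced (Å for length, τ for time, ε/Å for force).

```python
def tip_displacement(t, v_p=0.005):
    """Displacement d of the pulled spring end after time t (in tau),
    for a constant pulling speed v_p (in Angstrom per tau)."""
    return v_p * t

def pulling_force(outer_end, bead, k=0.12):
    """Force (in epsilon per Angstrom) exerted by the pulling spring on the
    terminal bead; both positions are measured along the pulling direction."""
    return k * (outer_end - bead)
```

Recording `pulling_force` against `tip_displacement` over a run would give one F-d trace of the kind discussed in the next section.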



3 The force-displacement curves 

Stretching of up to five domains of titin within the Go model has been analysed
in detail in [18]. Here, we present the F-d patterns for titin to provide a
reference for calmodulin. Figure 3 shows the F-d patterns obtained for one,
two, and three domains of 1tit, linked in tandem, at T = 0, i.e. when no
thermal fluctuations are taken into account. The multidomain patterns are
essentially a serial repeat of the single domain curve. The single domain curve
has two major force peaks. The first of these has a height of nearly 4 ε/Å and it
occurs due to the unravelling of the links between the β-strands that exist at
the opposite terminals of the protein. These links are primarily between the
strands A' (amino acids 11-15), A (amino acids 4-7) and G (amino acids
78-88). The second major peak is due to breaking the C-F and B-E links where
B, C, E, and F strands correspond to segments 18-25, 32-36, 55-61, and 69-75
respectively. The small hump on the rising side of the first major peak is due
to a rupture in the A-B region [39] and it corresponds to the intermediate
state that was identified by Marszalek et al. [10]



Figures 4 and 5 show the corresponding patterns for T of 0.3 and 0.6. The
increase in T lowers the force peaks and makes them occur
earlier during the stretching. This is because the thermal fluctuations provide
additional unravelling forces. In the entropic limit, reached around T of 0.8,
there are no identifiable force peaks and the F-d curves are described by the
featureless worm-like-chain model [18,16]. In this limit, the domains unravel
simultaneously. At intermediate temperatures, the unravelling is part serial
and part parallel. At T = 0.3, the stretching of several domains is predominantly
serial in character and the F-d curves seem to be qualitatively similar to
the saw-tooth patterns obtained experimentally [6,9]. Note that the second
major force peak that has been clearly identified in the T = 0 trace disappears
at T = 0.3 except for a weak shadow of it in the first period of the serial
pattern.
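The text does not spell out the worm-like-chain expression. One widely used closed form is the Marko-Siggia interpolation formula, F = (k_B T / p) [1/(4(1 - x/L)²) - 1/4 + x/L], and the sketch below assumes that form; whether the cited references use exactly this expression is not stated here. The parameter names (contour length L, persistence length p) are illustrative.

```python
def wlc_force(x, contour_length, persistence_length, kT=1.0):
    """Marko-Siggia interpolation for the force of a worm-like chain
    stretched to end-to-end extension x (0 <= x < contour_length)."""
    if not 0.0 <= x < contour_length:
        raise ValueError("extension must satisfy 0 <= x < contour_length")
    z = x / contour_length
    return (kT / persistence_length) * (0.25 / (1.0 - z) ** 2 - 0.25 + z)
```

The curve rises smoothly and diverges only as x approaches the contour length, which is the sense in which the high-temperature F-d traces are called featureless.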



The multidomain tandem arrangement is constructed so that the C-terminal
of one domain is connected to the N-terminal of another by an extra Cα-Cα
bond along the end-to-end direction in a single domain.



The F-d curves for calmodulin shown in Figures 6 through 8 are the analogs
of Figures 3 through 5 for titin and they demonstrate an entirely different
behavior. The single domain curve at T = 0 displays only minor force peaks.
The biggest force (before the stage of fully stretched conformation is reached)
is only about 1.5 ε/Å, less than 40 % of the maximum force found in titin.
Furthermore, the multidomain curves are not serial repetitions of the single
domain result. Instead, the particular features in the plot get effectively
multiplied in segments, indicating a large degree of parallelism in the unwinding.
Our studies [40] of two helices connected in series have indicated that they
unwind in parallel. Thus the behavior found for calmodulin echoes this finding.



An increase in the temperature causes effects which are consistent with the
general scenario [16]: the peaks get lower and become less resolved. The
interesting part is that, for calmodulin, the peaks at T = 0.3 are so inconspicuous
that the curves look as though the system was almost in the
entropic/worm-like-chain limit. This limit is fully achieved at T = 0.6, as
demonstrated in Figure 8. The nearly featureless nature of the experimental F-d
curves taken at room temperature [13] is consistent with the T = 0.3 finding
based on the Go model.
4 Scenarios of unfolding for calmodulin

Insights into unfolding can be obtained by looking at the snapshots of the
process. Figure 9 shows the snapshots for 1cfc at T = 0. Similar to titin, the
system starts unwinding by breaking bonds that connect the terminal
segments. However, the breakage involves a much smaller force. The next stage
involves separation of helices which are parallel to each other, and finally
unwinding of the helices themselves. Figure 10 shows that the stretching of two
domains engages both of them from the early stages on.

A convenient way to describe the unfolding process is by providing the
'scenario diagrams' in which the distances, d_u, at which specific contacts break
are plotted against the contact order, i.e. against the sequential distance |j - i|
between the contact making amino acids i and j. At T = 0, there is a unique
distance at which a contact breaks. At finite temperatures, contacts may
re-form so a meaningful definition of d_u is through a distance at which the
contact exists for the last time. The technical criterion for the existence of a
contact is that the distance between the amino acids involved is less than
1.5 σ_ij [31,32,18].
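The definition of d_u above translates directly into a scan over a recorded trajectory. The sketch below is illustrative: the function names and the input layout (parallel sequences of tip displacements and of the instantaneous i-j distance) are assumptions, not the author's data format.

```python
def contact_order(i, j):
    """Sequential distance between the contact making amino acids."""
    return abs(j - i)

def last_contact_displacement(displacements, distances, sigma_ij, cutoff=1.5):
    """Tip displacement d_u at which the contact exists for the last time.
    A contact exists when the pair distance is below cutoff * sigma_ij.
    Returns None if the contact is never present in the record."""
    d_u = None
    for d, r in zip(displacements, distances):
        if r < cutoff * sigma_ij:
            d_u = d  # keep overwriting, so the final value is the last occurrence
    return d_u
```

Because the scan keeps the last occurrence rather than the first breakage, a contact that re-forms at finite temperature is assigned the later displacement, as the definition requires.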

Figures 11 and 12 show the stretching scenarios for 1cfc at T = 0 and 0.3
respectively. Six sets of interactions are represented by polygonal or circular
symbols as indicated in the figures whereas the remaining interactions are
marked by the crosses. For instance, the interactions between the 6-17 and
65-75 amino acids in two α-helices are shown as the open circles and marked
as H(6-17)-H(65-75). The other character symbols follow the conventions of
the Protein Data Bank: E is an extended β-strand and S is a bend. The two
scenarios are rather similar to each other. For instance, in both cases stretching is
initiated in the terminal region indicated by CN at the high end of the contact
order. However, the ordering of certain events is switched in time. For instance,
the rupturing of the H(82-92)-H(102-111) contacts (the triangles) at T = 0
takes place later than that of the H(6-17)-H(65-75) (the open circles) whereas
at T = 0.3 the opposite ordering takes place. Furthermore, the T = 0.3 events
are more noticeably accumulated into well defined stripes compared to the
bigger scatter and single event resolution seen in the T = 0 plot.

We now focus on the room-temperature-like case of T = 0.3 and consider the
stretching scenarios for several domains of calmodulin. Figure 13 refers to the
case of two domains. The star symbols denote the contacts in the first domain
whereas the squares denote those in the second domain. There is a clear
intermixing of the symbols which indicates a non-serial character of unfolding.
In particular, the contacts in the near-terminal (CN) regions unwind nearly
simultaneously in both domains, in marked contrast to what happens in two
titin domains [18]. An even heavier mixing takes place in the case of three
domains, as shown in Figure 14.

In summary, the Go-like models can capture the experimentally found
difference in the elastic behavior between calmodulin and titin and can elucidate the
microscopic picture of the events in stretching. The force-displacement
patterns are sensitive to temperature and acquire the worm-like-chain behavior
in the entropic limit. Unwinding of modular proteins need not be serial
in nature and calmodulin provides a clear example of such a non-seriality. Its
source is the mostly α character of the protein.

The Author thanks T. X. Hoang and M. O. Robbins for discussions and
collaboration. He also thanks P. E. Marszalek for his comments on the manuscript.
This work was funded by Polish Ministry of Science, project 2 P03B 032 25,
and the European program IP NAPA through Warsaw University of Technology.
Fig. 1. The backbone representation of the 1cfc structure of calmodulin.
Fig. 2. The distribution of sequential distances in the native contacts. The solid
line is for 1tit and the broken line is for 1cfc. The bin sizes generally correspond
to the distance of 10 except for the very first bin which counts contacts of length
smaller than or equal to 4 and the second bin which counts contacts with the
distance bigger than 4 but not exceeding 10. The inset shows the corresponding
contact maps. The contact maps are symmetric and only a half is shown for each
protein.
Fig. 5. Same as in Figure 3 but for T = 0.6.
Fig. 7. Same as in Figure 6 but for T = 0.3.
Fig. 8. Same as in Figure 6 but for T = 0.6.
Fig. 9. The conformations of 1cfc during unfolding at T = 0. The labels on the right
Fig. 10. Same as in Figure 9 but for two 1cfc domains connected in tandem.
Fig. 12. Same as in Figure 11 but for T = 0.3. The data points are for a single
trajectory.
Fig. 13. The scenario diagram for two domains of 1cfc at T = 0.3.
Fig. 14. The scenario diagram for three domains of 1cfc at T = 0.3.